Exploiting Metadata for Ontology-Based Visual Exploration of Weakly Structured Text Documents

Authors

  • Christian Seeling
  • Andreas Becks
Abstract

A large amount of strategically relevant business information is contained in unstructured texts. While information brokering approaches are used to contextualize such documents and to generate metadata, text mining is used to explore large document spaces. So far, little attention has been paid to a value-adding combination of these technologies. In this paper, we show how metadata and documents can be represented in a complementary way and used interactively to support users in text corpus analysis. We present a text analysis portal which displays interdocument similarity by means of so-called document maps, complemented by a display of the domain ontology and metadata-based access methods.

Similar resources

Document Clustering Based on Ontology and a Fuzzy Approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. The most important step...

An Ontology-based Framework for Text Mining

Structuring of text document knowledge is frequently achieved either by ontologies and metadata or by automatic (un-)supervised text categorization. This paper describes our integrated framework OTTO (OnTology-based Text mining framewOrk). OTTO uses text mining to learn the target ontology from text documents and then uses the same target ontology in order to improve the effectiveness of both sup...

A New Text Mining Method for Extracting User Context Information to Improve Search Engine Result Ranking

Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the ones most similar to the user ...

DEDUCE Clinical Text: An Ontology-based Module to Support Self-Service Clinical Notes Exploration and Cohort Development.

Large amounts of information, as well as opportunities for informing research, education, and operations, are contained within clinical text such as radiology reports and pathology reports. However, this content is less accessible and harder to leverage than structured, discrete data. We report on an extension to the Duke Enterprise Data Unified Content Explorer (DEDUCE), a self-service query t...

SATORI: a system for ontology-guided visual exploration of biomedical data repositories.

Motivation: The ever-increasing number of biomedical datasets provides tremendous opportunities for re-use but current data repositories provide limited means of exploration apart from text-based search. Ontological metadata annotations provide context by semantically relating datasets. Visualizing this rich network of relationships can improve the explorability of large data repositories and he...

Publication date: 2003